Abstract

Lattice rounding in Euclidean space can be viewed as finding the 
nearest point in the orbit of an action by a discrete group, relative 
to the norm inherited from the ambient space. Using this point of 
view, we initiate the study of non-abelian analogs of lattice rounding 
involving matrix groups. In one direction, we give an algorithm for 
solving a normed word problem when the inputs are random products 
over a basis set, and give theoretical justification for its success. In 
another direction, we prove a general inapproximability result which 
essentially rules out strong approximation algorithms (i.e., whose
approximation factors depend only on dimension) analogous to LLL in
the general case. 

Keywords: lattice rounding, matrix groups, norm concentration,
Lyapunov exponents, word problems, inapproximability.


1 Introduction 

Given a basis $\{a_i\}_{i=1}^n$ of a lattice $L \subset \mathbb{R}^n$ and a vector
$y \in \mathbb{R}^n$, the Lattice Rounding Problem (LRP) in Euclidean space asks
to find $\operatorname{argmin}_{z \in L} \|z - y\|_2$, that is, a vector $z \in L$
nearest to $y$. This problem is very closely related to the
lattice basis reduction problem of finding a good basis for $L$, which informally
is to find another basis $\{b_i\}_{i=1}^n$ for $L$ whose elements are as orthogonal as
possible. The motivation is that given such a good basis $\{b_i\}_{i=1}^n$, LRP may
be easy. To wit, if $L = \mathbb{Z}^n$ a good basis is trivial to find, and LRP can be
solved by coordinate-wise rounding. For general $L$ and bases $\{a_i\}_{i=1}^n$, one
has NP-hardness results for exact and approximate versions of LRP,
and their study is an active area of research.
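The role of the basis in rounding can be illustrated in dimension 2. The following is a minimal sketch, not taken from the paper: the function name `round_in_basis_2d` and the example bases are assumptions chosen for illustration. It writes $y$ in the given basis, rounds each coefficient, and returns the resulting lattice point; with the standard basis of $\mathbb{Z}^2$ this is coordinate-wise rounding, while a skewed basis of the same lattice can return a point far from the nearest one.

```python
from fractions import Fraction


def round_in_basis_2d(b1, b2, y):
    """Round y to the lattice point k1*b1 + k2*b2 obtained by rounding the
    coefficients of y in the basis (b1, b2). Illustrative helper only."""
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        raise ValueError("b1 and b2 must be linearly independent")
    y0, y1 = Fraction(y[0]), Fraction(y[1])
    # Cramer's rule: y = c1*b1 + c2*b2
    c1 = (y0 * b2[1] - y1 * b2[0]) / det
    c2 = (b1[0] * y1 - b1[1] * y0) / det
    k1, k2 = round(c1), round(c2)
    return (k1 * b1[0] + k2 * b2[0], k1 * b1[1] + k2 * b2[1])


y = (Fraction(2, 5), Fraction(3, 5))
good = round_in_basis_2d((1, 0), (0, 1), y)   # standard basis of Z^2
bad = round_in_basis_2d((1, 0), (10, 1), y)   # skewed basis of the same lattice
```

In this example both bases generate $\mathbb{Z}^2$, but only the first returns a nearest lattice point to $y$.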

The presumed hardness of these problems also has led to constructions 
of cryptosystems. This typically involves three main ingredients: 

(a) Good Basis. Generation of a basis $\{b_i\}_{i=1}^n$ for $L$ that is good in the
sense that LRP is easy relative to it on inputs randomly chosen from
some distribution $\nu$.

(b) Bad Basis. Generation of a suitable matrix $M \in SL_n(\mathbb{Z})$ such that
LRP with respect to $\nu$ is hard relative to the basis $\{a_i\}_{i=1}^n$, where
$a_i = M b_i$.

(c) Public Key System. One keeps the good basis as the private key and
the bad basis as a public key, and designs an encryption or signature
scheme such that an attack on it would entail solving LRP relative to a
bad basis.

This paper presents a non-abelian generalization of lattice rounding, and
some steps in the direction of ingredients (a) and (b). Our generalization
starts with the viewpoint of $\mathbb{R}^n$ as an additive abelian group and $L$ as a
discrete subgroup: LRP is equivalent to finding the point nearest to $y$ (in the
ambient metric) in the orbit of the origin under the action of $L$. This
viewpoint can be extended to a larger class of groups, and spaces upon which
they act. For example, one could consider a Lie group such as the $n \times n$
invertible matrices $G = GL_n(\mathbb{R})$, and a discrete subgroup $\Gamma$; this direction
quickly leads to rich mathematical theory connected with dynamics and
automorphic forms. In this case one could choose ambient metrics on $G$ related
to a variety of matrix norms.

Another direction is to consider the action of $G$ on some space $X$ endowed
with its own metric. For example, $G = GL_n(\mathbb{R})$ acts on the vector space
$X = \mathbb{R}^n$ or even the projective space $\mathbb{RP}^{n-1}$ by the usual multiplication of
vectors by matrices. Let $\Gamma$ as before denote a subgroup of $G$. A non-abelian
analog of lattice rounding asks to find the closest point in the $\Gamma$-orbit of a
fixed vector in $\mathbb{R}^n$, where the closeness is measured using some natural metric
on vectors (but not on matrices, although we do make a restriction on word
length for practical reasons).

Alternatively, if $\Gamma$ and $X$ are themselves endowed with a discrete
structure (e.g., $\Gamma$ consists of integral matrices and $X$ consists of integral
vectors), we can instead study the problem of recognizing elements of a
$\Gamma$-orbit. To address items (a) and (b) above, it is natural to ask if one can
develop analogous positive algorithms for rounding with good bases and,
conversely, negative results for general subgroups $\Gamma$ in $GL_n(\mathbb{R})$. One naive
approach would be to modify a generating set $\{g_1, \ldots, g_r\}$ by successively
replacing a generator $g_i$ by $g_i g_j^c$, where $j \neq i$ and $c \in \mathbb{Z}$. In the abelian
case such repeated modifications generate any change of lattice basis.
However, in the non-abelian case there are some geometric constraints (such as
coarse quasi-isometry) which may at times dull the effects of such a change.
We do not investigate this direction here.

In Section 3 we consider the Word Problem on Vectors (3.3), for which
we propose the Norm Reduction Algorithm (3.4). The analysis of the latter
leads to well-studied mathematical and algorithmic topics. For example,
multiplying random elements of $\Gamma$ times a fixed vector can be viewed as a
generalized Markov chain (using more than one matrix); the growth of the vector
norms of these products is itself a generalization of the law of large numbers
(the case of $n = 1$). Additionally, the conditions for the success of our Norm
Reduction Algorithm depend on an analog of the spectral or norm gap in
Markov chains: it requires instead a gap between Lyapunov exponents (see
(4.9)).
Some remarks on our generalization

The generalization of LRP from lattices $L$ in $\mathbb{R}^n$ to finitely-generated
subgroups $\Gamma = \langle S \rangle$ in $GL_n(\mathbb{R})$ is neither unique nor straightforward. Here we
seek to make a distinction between our norms and the word-length metric,
since the latter already appears in the existing literature in combinatorial
group theory and the study of special groups (e.g., braid groups [S]) from
algorithmic and cryptographic points of view. We informally outline a few
issues that guide our formulation.

Full (or at least large) dimensionality: We would like our discrete
subgroups to not be contained inside some subgroup of much smaller
dimension of the ambient group. In $\mathbb{R}^n$ one typically assumes the lattice $L$ has
full rank, or at least has relatively large rank. Its natural matrix analogue is
to require the Zariski closure of $\Gamma = \langle S \rangle$ be the full group (or at least
correspond to a subgroup having a significant fraction of the dimension of the
full group). By definition, this means that the full group is the only group
containing $S$ which can be defined as the common zeroes of a set of polynomial
equations. This ensures $\Gamma = \langle S \rangle$ is non-abelian in as general a way as possible.

For example, if $S$ has only diagonal matrices it cannot generate any
non-abelian group, and its Zariski closure is at most an $n$-dimensional subgroup
of the $n^2$-dimensional group $GL_n(\mathbb{R})$. In fact, by considering commuting
diagonal matrices one can embed subset-sum type problems and get NP-hardness
results. Note that matrices composed of $2 \times 2$ blocks along the diagonal can
generate non-abelian groups that essentially describe simultaneous problems
in dimension 2; nevertheless, the Post Correspondence Problem can be
embedded as a word problem over $4 \times 4$ matrices with $2 \times 2$ blocks, proving
the undecidability of the latter {TB]. However, certain problems can actually
become easier in the non-abelian setting: for example, finding the order of a
random element in the symmetric group $S_n$ is much easier than in $(\mathbb{Z}/p\mathbb{Z})^*$.
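To make the last remark concrete, here is a small sketch (the function name `permutation_order` is an illustrative assumption) of why the order of an element of $S_n$ is easy to find: it is the least common multiple of the cycle lengths, which can be read off in time linear in $n$. By contrast, the usual methods for computing orders in $(\mathbb{Z}/p\mathbb{Z})^*$ rely on knowing the factorization of $p - 1$.

```python
from math import gcd


def permutation_order(perm):
    """Order of a permutation of {0, ..., n-1}, where perm[i] is the image of i.
    Computed as the least common multiple of the cycle lengths."""
    seen = [False] * len(perm)
    order = 1
    for start in range(len(perm)):
        length = 0
        i = start
        while not seen[i]:       # walk the cycle containing `start`
            seen[i] = True
            i = perm[i]
            length += 1
        if length:               # a new cycle was found
            order = order * length // gcd(order, length)
    return order
```

For instance, a permutation consisting of a 3-cycle and a disjoint 2-cycle has order 6.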

Metrics: The distinction between the word length metric and ambient
matrix norm is discussed in some detail in Section 2 below. The former
depends on the generating set $S$. In general these can be very different
notions of distance, which makes our study difficult, yet is key to potential
cryptographic applications. We use the Furstenberg-Kesten theory [6|l71fl3]
of random matrix products to correlate the two (in a probabilistic sense) in
certain situations, which is analogous to the “good basis” situation described
in (a) above.

Finite co-volume and compactness: If $L$ has full rank, then $L \backslash \mathbb{R}^n$
is a compact, finite-volume quotient. However, neither property necessarily
extends to the quotients $\Gamma \backslash G$ in many important examples of $\Gamma$ and $G$. Thus
we do not impose this requirement. Some further comments are given
at the beginning of the following section.

Outline of this paper 

Section 2 contains some background about different metrics on Lie groups
and their discrete subgroups. Section 3 introduces the statements of the
word problems that motivate our results, as well as the Norm Reduction
Algorithm (3.4), which is rigorously analyzed in Theorem 4.1. The Closest
Group Element Problem is also given in Section 3, along with the statement
of its inapproximability result, Theorem 3.1. The analysis of the Norm
Reduction Algorithm is performed in Section 4 using results in dynamical systems.
Some experimental results on the algorithm are also presented in Section 4.5.
The proof of Theorem 3.1 is given in Section 5; it demonstrates a polynomial
time reduction from the Traveling Salesman Problem.

We would like to thank Anthony Bloch, Hillel Furstenberg, Nathan Keller, 
Peter Sarnak, Adi Shamir, Boaz Tsaban, and Akshay Venkatesh for their 
helpful comments. 

2 Background 

Just as a lattice $L = \langle a_1, \ldots, a_n \rangle$ is additively generated by its basis $\{a_i\}$,
the subgroups $\Gamma = \langle g_1, \ldots, g_k \rangle$ we consider will be finitely generated. A
crucial difference, however, is that the quotient of $\mathbb{R}^n$ by $L$ is a compact
$n$-dimensional torus with finite volume under the usual Lebesgue measure
on $\mathbb{R}^n$ (for example, the quotient $\mathbb{Z}^n \backslash \mathbb{R}^n$). However, this fails to be true
for nice examples such as $GL_n(\mathbb{Z}) \backslash GL_n(\mathbb{R})$ or even $SL_n(\mathbb{Z}) \backslash SL_n(\mathbb{R})$, both
of which are noncompact under the natural group invariant metric inherited
from $G$ (the latter quotient, however, does have finite volume). The theory
and construction of both compact and noncompact discrete subgroups of Lie
groups involves numerous beautiful subtleties (see [15j[2i]); we do not restrict
ourselves to these objects in this paper.

There are two natural notions of size in $\Gamma$, and by extension in the $\Gamma$-orbit
of any basepoint $x \in X$:

1. Word length metric: If $S = \{g_1, \ldots, g_k\}$ is a generating set of $\Gamma$ as
above, any element $w \in \Gamma$ can be expressed as a finite word in the
alphabet $S \cup S^{-1}$. There may be many possibilities for such a word,
taking into account relations amongst the $g_i$ (including the trivial
relation $g_i g_i^{-1} = 1$). The minimal length among all such expressions
is the word length of $w$ with respect to $S$.

The ability to efficiently compute the word length of $w$ enables one
to efficiently write it as a minimal length word, simply by successively
checking which of the expressions $g_i^{\pm 1} w$ reduces the word length by
one. Finding the word length depends of course on the generating
set $S$, which is analogous to the basis of a lattice. In analogy with
ingredients (a), (b), and (c) above for Euclidean lattices, we want the
word length to be difficult for typical generating sets $S$ of $\Gamma$, yet at the
same time easy for some “good bases” $S$; moreover, we would like to
be able to transform each “good base” into a seemingly bad one.

2. Inherited metric: Fundamental to lattice reduction and rounding is the
notion of metric on the ambient space. Natural metrics on $G$ and $X$
therefore can be used to give generalizations of lattice rounding.
Combining this with word length results in problems such as the following:
given $\ell \in \mathbb{N}$, $\Gamma \subset GL_n(\mathbb{R})$, and vectors $y$ and $z \in \mathbb{R}^n$, find $\gamma \in \Gamma$ such
that $\|\gamma y - z\|_2$ is minimized over all $\gamma \in \Gamma$ with word length at most
$\ell$. Thus the length parameter $\ell$ is used to complement (rather than to
duplicate) the ambient metric.
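As a baseline for the bounded rounding problem just described, the following sketch enumerates every word of length at most $\ell$ and keeps the one whose image of $y$ is closest to $z$. It is only an illustration: the names `mat_vec` and `closest_orbit_point` are hypothetical, only positive words in the supplied generators are searched (inverses can be passed as extra generators), and the running time is exponential in $\ell$.

```python
from itertools import product


def mat_vec(g, v):
    """Multiply the matrix g (a sequence of rows) by the vector v."""
    return tuple(sum(a * b for a, b in zip(row, v)) for row in g)


def closest_orbit_point(gens, y, z, max_len):
    """Exhaustively search words g_{s_1} ... g_{s_l}, l <= max_len, minimizing
    the squared Euclidean distance between g_{s_1} ... g_{s_l} y and z.
    Returns (squared distance, word as a tuple of indices, image of y)."""
    best = None
    for length in range(max_len + 1):
        for word in product(range(len(gens)), repeat=length):
            x = tuple(y)
            for idx in reversed(word):   # the rightmost letter acts first
                x = mat_vec(gens[idx], x)
            dist2 = sum((a - b) ** 2 for a, b in zip(x, z))
            if best is None or dist2 < best[0]:
                best = (dist2, word, x)
    return best


A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))
result = closest_orbit_point((A, B), (1, 1), (3, 2), 2)
```

Here the target $(3, 2)$ equals $AB\,(1,1)^T$, so the search returns distance 0 with the word (0, 1).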

Though we do not present any cryptographic systems here, generalizations
of attacks on existing cryptosystems motivate studying rounding problems
in more general settings than lattices in $\mathbb{R}^n$ alone. With some
performance enhancing additions, the lattice reduction algorithm LLL [T2] has
long been a valuable tool in cryptanalysis, and typically is more
effective than the provable guarantees attached to it indicate alone. Starting
with the original attack of Shamir |2T], some very effective attacks have been
discovered. The attacks are often based on the Shortest Vector Problem in
lattices: given a basis for $L$, find a nonzero vector in $L$ with minimal norm.
In polynomial time, the LLL algorithm finds a vector within a factor of $2^{n/4}$
of being the shortest, a strong bound, i.e., one which depends only on the
dimension of the lattice, and not on the sizes of the entries in the lattice
basis themselves. Babai’s rounding algorithm [2], which is based on LLL,
also has this feature for solving lattice rounding problems in Euclidean
space. The fact that this bound depends only on the dimension is crucial for
attacks.

In contrast, we prove in Theorem 3.1 that the analogous question of
rounding products of matrices cannot have a polynomial time strong
approximation algorithm¹ unless P = NP. This is done by creating a polynomial
time reduction from the Traveling Salesman Problem, which has a similar
inapproximability result. Thus a strong approximation algorithm like LLL for
rounding in matrix groups is unlikely to exist.

3 Some non-abelian problems and an algorithm

We study problems that arise out of group actions on normed spaces, where
we are concerned with the action of group elements that have short
expressions relative to a given basis or generating set. We now proceed to formally
define these problems and state some known results.

We shall work with $GL_d(\mathbb{R})$, the group of all invertible $d \times d$ real matrices,
and often with subsets that have integer entries. Given $g_1, \ldots, g_k \in GL_d(\mathbb{R})$,
we consider the possible products of these matrices up to a certain length
bound, and whether or not they can be recognized as such. The word problem
is the algorithmic task of representing a given matrix in this semigroup as a
product of the generators:


Word Problem

Input : Matrices $g_1, \ldots, g_k$ and $x \in GL_d(\mathbb{R})$.

Output : An integer $\ell > 0$ and indices $1 \le s_1, \ldots, s_\ell \le k$ such
that $g_{s_1} g_{s_2} \cdots g_{s_\ell} = x$, if such a solution exists.


¹ Where the approximating factor is a polynomial time computable function of the
dimension.

This word problem is known to be unsolvable when $d \ge 4$; however,
there is an algorithm for specifically constructed generators when $d = 2$ [9]
(the case of $d = 3$ is open). It becomes NP-hard for $d \ge 4$ if we bound the
word length $\ell$, as we do for all our problems in the rest of the paper:


Bounded Word Problem

Input : An integer $L > 0$, and matrices $g_1, \ldots, g_k$ and $x \in GL_d(\mathbb{R})$.

Output : Indices $1 \le s_1, \ldots, s_L \le k$ such that $g_{s_1} g_{s_2} \cdots g_{s_L} = x$,
if such a solution exists.

(3.2)


This problem can be modified to allow for words of length $\le L$.

We now define another related problem, in which the matrices act on 
vectors: 


Word Problem on Vectors

Input : An integer $L > 0$, matrices $g_1, \ldots, g_k \in GL_d(\mathbb{R})$ with
integer entries, and nonzero vectors $v, w \in \mathbb{Z}^d$.

Output : An integer $\ell \le L$ and indices $1 \le s_1, \ldots, s_\ell \le k$
such that $g_{s_1} g_{s_2} \cdots g_{s_\ell} v = w$, if such a solution exists.

(3.3)


Typically we are interested in instances where $\ell = L$ and the indices $s_j$ are
chosen independently and uniformly at random from the above interval.
Using the ambient norm on Euclidean space, we present the following algorithm
for this problem:


Norm Reduction Algorithm:

Let $j = 0$, and $t$ be a fixed parameter.

repeat

    $j = j + 1$

    $s_j = \operatorname{argmin}_i \|g_i^{-1} w\|$

    $w = g_{s_j}^{-1} w$

until $w = v$ or $j = L - t$.

Solve for $s_{L-t+1}, \ldots, s_L$ by exhaustive search.
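The steps above can be sketched in code as follows. This is a minimal illustration rather than the implementation used for the experiments: the names `norm_reduction`, `mat_vec` and `norm2` are hypothetical, indices are 0-based, the inverse matrices are supplied by the caller, and squared norms are compared (which yields the same argmin). The example generators are an assumption chosen so that the greedy choice succeeds.

```python
from itertools import product


def mat_vec(g, x):
    """Multiply the matrix g (a sequence of rows) by the vector x."""
    return tuple(sum(a * b for a, b in zip(row, x)) for row in g)


def norm2(x):
    """Squared Euclidean norm."""
    return sum(c * c for c in x)


def norm_reduction(gens, gens_inv, v, w, L, t=0):
    """Look for indices s with g_{s_1} ... g_{s_l} v = w and l <= L.
    Greedy norm reduction for up to L - t steps, then exhaustive search
    over at most t remaining letters. Returns a list of indices or None."""
    v, w = tuple(v), tuple(w)
    word = []
    while w != v and len(word) < L - t:
        # choose the generator whose inverse shrinks w the most
        i = min(range(len(gens_inv)), key=lambda i: norm2(mat_vec(gens_inv[i], w)))
        w = mat_vec(gens_inv[i], w)
        word.append(i)
    if w == v:
        return word
    for length in range(1, t + 1):
        for tail in product(range(len(gens)), repeat=length):
            x = v
            for i in reversed(tail):   # the rightmost letter acts first
                x = mat_vec(gens[i], x)
            if x == w:
                return word + list(tail)
    return None


A, B = ((1, 2), (0, 1)), ((1, 0), (2, 1))
A_inv, B_inv = ((1, -2), (0, 1)), ((1, 0), (-2, 1))
# w = A B A v for v = (1, 1)
recovered = norm_reduction((A, B), (A_inv, B_inv), (1, 1), (17, 7), L=3)
```

In this example the word (0, 1, 0), i.e. $ABA$, is recovered; nothing here indicates how the method behaves on generators that violate the conditions discussed in Section 4.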


We include the option of exhaustive search for the final $t$ steps in case the
algorithm performs worse on smaller words than on larger ones. Another
possibility is to use a memory-length look-ahead algorithm such as in [ T9l §7].
The Norm Reduction Algorithm is rigorously analyzed in the next section,
where it is related to a maximal likelihood algorithm. Its success depends
on some mild yet complicated conditions on generators $g_1, \ldots, g_k$ that come
from dynamics. Theorem 4.1 in the next section gives a rigorous upper bound
on the error probability of this algorithm. We give a successful numerical
example in the table in Section 4.5, along with how the constants of
Theorem 4.1 pertain to it.

One can also define a related rounding problem, whose analysis and
algorithms are quite similar. Instead, we will focus on the following matrix
rounding question: finding a short word in a semigroup closest to a given one
(with a length constraint imposed for practical reasons).


Closest Group Element Problem (CGEP)

Input : A positive integer $L > 0$, and matrices $g_1, \ldots, g_k$ and
$z \in GL_d(\mathbb{R})$.

Output : The closest word of length $\le L$ in the $g_i$ to $z$.

(3.5)


Though the problem can be stated for various notions of distance, we will
use the sum-of-squares matrix distance

$$\|(a_{ij}) - (b_{ij})\|^2 = \sum_{i,j} |a_{ij} - b_{ij}|^2 \tag{3.6}$$

in studying this problem.
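For very small instances CGEP can be solved by enumeration, which gives a reference point for the hardness result below. The sketch is illustrative only: `cgep_brute_force` is a hypothetical name, words are taken to be nonempty, matrices are tuples of rows, and the search visits all $k + k^2 + \cdots + k^L$ words.

```python
from itertools import product


def mat_mul(a, b):
    """Product of two square matrices given as tuples of rows."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def squared_distance(a, b):
    """Sum of squared entrywise differences between two matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def cgep_brute_force(gens, z, max_len):
    """Return (squared distance, word) for a nonempty word of length at most
    max_len in the generators whose product is closest to z."""
    best = None
    for length in range(1, max_len + 1):
        for word in product(range(len(gens)), repeat=length):
            m = gens[word[0]]
            for i in word[1:]:
                m = mat_mul(m, gens[i])
            d2 = squared_distance(m, z)
            if best is None or d2 < best[0]:
                best = (d2, word)
    return best


A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))
closest = cgep_brute_force((A, B), ((2, 1), (1, 1)), 2)
```

Here the target matrix equals $AB$, so the search reports squared distance 0 for the word (0, 1).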

Our main result about the CGEP problem is the following negative result,
which comes close to ruling out the existence of an algorithm such as LLL
that approximates the closest element up to a constant factor depending only
on the dimension. In the following we denote by $\mathrm{CGEP}(g_1, \ldots, g_k, z, L)$ the
solution to the CGEP problem as above.

Theorem 3.1. Let $f : \mathbb{Z}_{>0} \to [1, \infty)$ be a polynomial time computable
function. If there exists a polynomial time algorithm $\mathcal{A}$ which, given the input of
a CGEP problem as in (3.5), always outputs a word $w'$ of length $\le L$ in the
$g_i$ such that

$$\|w' - z\| \le f(d)\, \|\mathrm{CGEP}(g_1, \ldots, g_k, z, L) - z\|, \tag{3.7}$$

then P = NP.

It is an interesting open problem whether or not the approximation factor
can instead depend on the sizes of the entries.
4 Maximum Likelihood Algorithms

In this section we give and analyze a simple algorithm to solve the Word
Problem on Vectors (3.3): try to reduce the norm at each step, or put
differently, attempt to use the norm as a proxy for word length. This involves
studying some background from dynamics related to random products of
matrices, first studied by Furstenberg and Kesten [7]. Our results are sensitive
to certain conditions related to the generators, which we describe before
stating our result. These are discussed thoroughly in the book [13], which serves
as a general reference for background material on the topic of this section.
In addition, several of the techniques and arguments in this section are taken
from [13].

Let $S = \{g_1, \ldots, g_k\}$ denote a finite subset of $G = GL_d(\mathbb{R})$, and $\Gamma =
\langle S \rangle$ the semigroup it generates. Throughout this section we will use $\|g\|$ to
denote the operator norm of a matrix $g$. We make the following standing
assumptions on the set $S$ throughout this section:

A1. $\Gamma$ is contracting in the sense of [3, Definition III.1.3]. This means that
$\Gamma$ has a sequence of matrices $M_1, M_2, \ldots$ such that $M_n / \|M_n\|$ converges
to a rank 1 matrix. It is readily seen (using Jordan canonical form) that
this condition holds automatically if $S$ (or even $\Gamma$) contains a matrix
with an eigenvalue strictly larger than its others in modulus.

A2. $\Gamma$ is strongly irreducible: there is no finite union of proper vector
subspaces of $\mathbb{R}^d$ which is stabilized by each element of $\Gamma$. Equivalently,
the same statement holds with $\Gamma$ replaced by the group generated by
$S$ ([3, p. 48]).

A3. The operator norms $\|g_j^{-1} g_i\|$, $j \neq i$, are all at least some constant
$N > 1$.
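The constant $N$ in assumption A3 is directly computable from $S$. The following sketch does this for $2 \times 2$ generators only, using the closed form $\|M\|^2 = \tfrac12\big(T + \sqrt{T^2 - 4\det(M)^2}\big)$ for the operator norm, where $T$ is the sum of the squared entries of $M$. The function names and the example generators are assumptions made for illustration.

```python
import math


def op_norm_2x2(m):
    """Operator (spectral) norm of a real 2x2 matrix via its singular values."""
    (a, b), (c, d) = m
    t = a * a + b * b + c * c + d * d      # trace of M^T M
    det = a * d - b * c
    return math.sqrt((t + math.sqrt(max(t * t - 4 * det * det, 0.0))) / 2)


def inv_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def mul_2x2(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def separation_constant(gens):
    """Largest N with ||g_j^{-1} g_i|| >= N for all j != i (assumption A3)."""
    return min(
        op_norm_2x2(mul_2x2(inv_2x2(gj), gi))
        for j, gj in enumerate(gens)
        for i, gi in enumerate(gens)
        if i != j
    )


A, B = ((1, 2), (0, 1)), ((1, 0), (2, 1))
N = separation_constant((A, B))   # equals 2 + sqrt(5) for this pair
```

For higher dimensions one would replace `op_norm_2x2` by a general singular value computation.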

We prove the following result about the probability of success of the
Norm Reduction Algorithm (3.4). This gives a strong indication (along with
numerical testing) that norm reduction is a suitable algorithm for solving the
Word Problem on Vectors (3.3). It is also often possible to show that the
group generated by $S$ is free by deriving a quantitative version of the
well-known Ping-Pong Lemma. We do not address these issues in this version of
the paper.


Theorem 4.1. Let $S = \{g_1, \ldots, g_k\}$ be a fixed subset of $GL_d(\mathbb{R})$ and $v$ a fixed
nonzero vector in $\mathbb{R}^d$. Assume properties A1-A3. Then there exist positive
quantities $\alpha$, $B$, and $C$ such that if $h$ is a random product² of length $L$
of elements of $S$, the Norm Reduction Algorithm (3.4) recovers $v$ from $hv$ (i.e.,
solves the Word Problem on Vectors (3.3)) with probability at least

$$1 - C\,(L - t)\,(|S| - 1)\, N^{-\alpha},$$

where $N$ is as defined in assumption A3 and the parameter $t$ in the algorithm
is taken to be at least $B \log N$.

Roughly speaking, the algorithm succeeds for long enough words when
the operator norms $\|g_j^{-1} g_i\|$ are themselves sufficiently large. Though the
constant $N$ is readily computable from the generating set $S$, the numerical
values of $C$ and $\alpha$ are unfortunately more subtle. We are unable to rigorously
prove that $C$ is reasonably small, or that $\alpha$ is somewhat large. (It is not clear
that these statistics of $\langle S \rangle$ are even computable in general; see [5, 20, 22].) In
particular, one cannot directly take $N \to \infty$ to get the above error estimate
to decay to zero, without possibly simultaneously affecting $\alpha$. However, in
concrete examples of generating sets it is possible to make heuristic estimates
of the values of $N$ and $\alpha$ from the proof. We give such an example in
Section 4.5, in which numerical estimates for these constants give a small
error probability in Theorem 4.1. Our experiments on this example are vastly
better: the algorithm was successful in nearly all trials we tested for $L \le 1000$
(see the table in Section 4.5).

4.1 Motivation for the algorithm and its analysis

Recall the Word Problem on Vectors (3.3), in which the matrices in $S$ are
assumed to be integral. One is given $L \in \mathbb{N}$ and vectors $v$ and $w = hv \in \mathbb{Z}^d$,
where $h$ is an unknown word of length at most $L$ in $S$; the problem is to
find some word $h'$ of length at most $L$ in $S$ such that $w = h'v$. Were we
to have a concrete description of $\nu$ as a product $f\lambda$, where $f$ is an easily
computable function, we could attempt to solve for $h$ using the following
maximum likelihood algorithm:

² I.e., $h = g_{i_1} \cdots g_{i_L}$ where $i_1, \ldots, i_L$ are each chosen independently and
uniformly from $\{1, \ldots, k\}$.


Idealized Algorithm:

Let $j = 0$.

repeat

    $j = j + 1$

    $s_j = \operatorname{argmax}_i \dfrac{f(g_i^{-1} w)}{\|g_i^{-1} w\|^d \, |\det g_i|}$

    $w = g_{s_j}^{-1} w$

until $w = v$ or $j = L$.


Recall the notation $\operatorname{argmax}_i$ denotes a value of $i$ which maximizes the
expression it precedes. The particular expression here represents the change
in local density under the map $w \mapsto g_i^{-1} w$. The numerator accounts for the
difference between $\nu$ and $\lambda$, while the denominator represents the change in
the uniform measure $\lambda$. If successful, the algorithm produces $h'$ as $g_{s_1} g_{s_2} \cdots$,
possibly reconstructing $h$. However, it is impractical to assume that $f$ is
easily computable. Because of this limitation, we instead use the simpler, more
practical Norm Reduction Algorithm (3.4). It is tantamount to pretending
$f$ equals 1 and that the matrices have determinant 1, meaning that we seek
to minimize $\|g_i^{-1} w\|$ at each stage.

In effect, the Norm Reduction Algorithm (3.4) uses the norm as a height
function, and proceeds by descent to shorten the word length of $h$ each time.
Of course, a direct way to measure the word length would be preferable.
The relationship between word length and matrix norm has been studied by
several authors.

To study the distribution of elements of $\Gamma$ and of their orbits, we need
to define some measures. We let $\mu = \mu_S$ denote the Dirac measure of $S$ on
$G$, meaning that it gives mass $1/|S|$ to each element of $S$. Given two measures
$\mu_1$, $\mu_2$ on $G$, their convolution is defined as the unique measure $\mu_1 * \mu_2$
satisfying

$$\int_G f(x)\, d(\mu_1 * \mu_2)(x) = \int_G \int_G f(xy)\, d\mu_1(x)\, d\mu_2(y) \quad \text{for all } f \in C(G), \tag{4.1}$$

the continuous functions on $G$. To simplify notation we sometimes write
$\mu_1 * \mu_2$ simply as $\mu_1 \mu_2$; for example the $n$-fold convolution of $\mu$ with itself
will be denoted as $\mu^n$ (it is the measure giving mass $|S|^{-n}$ to each product
of $n$ elements taken from $S$, allowing repetitions). We can also define the
convolution of $\mu$ with any measure $\rho$ on $\mathbb{RP}^{d-1}$: $\mu * \rho$ is the unique measure
satisfying

$$\int_{\mathbb{RP}^{d-1}} f(x)\, d(\mu * \rho)(x) = \int_G \int_{\mathbb{RP}^{d-1}} f(Mx)\, d\rho(x)\, d\mu(M) \quad \text{for all } f \in C(\mathbb{RP}^{d-1}). \tag{4.2}$$

To be concrete, we identify measures on $\mathbb{RP}^{d-1}$ with measures on the unit
sphere in $\mathbb{R}^d$ that are invariant under the antipodal map. Typically the
uniform measure $\lambda$ on $\mathbb{RP}^{d-1}$ is not stabilized by convolution with $\mu$, unless
the matrices in $S$ are orthogonal. However, there exist measures $\nu$ on $\mathbb{RP}^{d-1}$
which are $\mu$-invariant:

$$\mu * \nu = \nu \tag{4.3}$$

(see [6, Lemma 1.2]). Under certain conditions more can be said about $\nu$, such
as its regularity properties. This measure is not always uniquely determined
by $S$, but assumptions A1 and A2 guarantee the uniqueness of the
$\mu$-invariant measure in our setting (see [3, Theorem III.4.3.(iii)]).

The main step in the proof of Theorem 4.1 involves estimating measures
of the subsets of vectors in $\mathbb{RP}^{d-1}$ which get contracted by the operators
$g_r^{-1} g_i$. Indeed, let $p_j$ equal the probability that the algorithm obtains the
wrong value for $g_{s_j}$ at the $j$-th step. One has that $p_j = \frac{1}{k} \sum_{i \le k} p_{j,i}$, where $p_{j,i}$
is the probability of error in the $j$-th step, conditioned on the correct answer
equaling $g_i$. In terms of the measure $\delta_v$, the Dirac measure of $v \in \mathbb{RP}^{d-1}$,
this probability can be computed as

$$p_{j,i} = \mu^{j-1} \delta_v (B_i), \tag{4.4}$$

where $\mu^{j-1} \delta_v$ denotes $\mu^{j-1} * \delta_v$ and

$$B_i = \{x \in \mathbb{RP}^{d-1} \mid \exists\, r \neq i \text{ such that } \|g_r^{-1} g_i x\| < \|x\|\} = \bigcup_{r \neq i} B_{r,i}, \tag{4.5}$$

with

$$B_{r,i} = \{x \in \mathbb{RP}^{d-1} \mid \|g_r^{-1} g_i x\| < \|x\|\}. \tag{4.6}$$

Thus the error probability in Theorem 4.1 is

$$\text{Prob Error} \le \sum_{j=t+1}^{L} p_j = \frac{1}{k} \sum_{\substack{t < j \le L \\ 1 \le i \le k}} p_{j,i} \le \frac{1}{k} \sum_{\substack{t < j \le L \\ 1 \le r \neq i \le k}} \mu^{j-1} \delta_v (B_{r,i}). \tag{4.7}$$

The proof therefore amounts to estimates on $\mu^{j-1} \delta_v (B_{r,i})$, which are given in
the following subsections.
4.2 Lyapunov Exponents

In the remainder of this section, we shall need some technical results and
concepts from the literature on random products of matrices. For the reader’s
convenience we have chosen to cite background results in the book [3]
wherever possible, while at the same time attempting to correctly attribute the
original source of the results. The top two Lyapunov exponents $\gamma_1$, $\gamma_2$ of $S$
are defined through the following limits (see [3, p. 6]):

$$\begin{aligned}
\gamma_1 &= \lim_{n \to \infty} \frac{1}{n}\, \mathbb{E}\{\log \|h\| \mid h \in S^n\} = \lim_{n \to \infty} \frac{1}{n} \int_G \log \|M\| \, d\mu^n(M), \\
\gamma_1 + \gamma_2 &= \lim_{n \to \infty} \frac{1}{n}\, \mathbb{E}\{\log \|\wedge^2 h\| \mid h \in S^n\} = \lim_{n \to \infty} \frac{1}{n} \int_G \log \|\wedge^2 M\| \, d\mu^n(M),
\end{aligned} \tag{4.8}$$

where $\wedge^2 g$ is the operator on $\wedge^2 \mathbb{R}^d$ given by $x \wedge y \mapsto gx \wedge gy$ and $\|\cdot\|$
denotes the operator norm (the general Lyapunov exponents are likewise
defined inductively through higher exterior powers). Not only do these limits
exist, but in fact a theorem of Furstenberg and Kesten [7] asserts that the
individual terms in the above sets are close to those limits with probability
one as $n \to \infty$. Under assumptions A1 and A2 one has separation between
these top two Lyapunov exponents:

$$\gamma_1 > \gamma_2 \tag{4.9}$$

([3, Theorem III.6.1]). We remark that computing or even approximating
the Lyapunov exponents is in general difficult [22].
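Although rigorous computation is difficult, a rough numerical impression of $\gamma_1$ can be obtained by simulation. The sketch below is a heuristic only and carries no error guarantee: `estimate_top_lyapunov` is a hypothetical name, and it reports $\frac{1}{n} \log \|hx\|$ for a single random product $h$ of length $n$ applied to a unit vector $x$, renormalizing at every step to avoid overflow.

```python
import math
import random


def estimate_top_lyapunov(gens, n, x0, rng):
    """Return (1/n) * log(||h x0|| / ||x0||) for one random product h of
    n generators drawn uniformly and independently from gens."""
    r0 = math.sqrt(sum(c * c for c in x0))
    x = [c / r0 for c in x0]
    total = 0.0
    for _ in range(n):
        g = rng.choice(gens)
        x = [sum(a * b for a, b in zip(row, x)) for row in g]
        r = math.sqrt(sum(c * c for c in x))
        total += math.log(r)          # accumulate the growth of the norm
        x = [c / r for c in x]        # renormalize
    return total / n


# Sanity check on a single diagonal generator, where gamma_1 = log 2.
rng = random.Random(0)
estimate = estimate_top_lyapunov([((2.0, 0.0), (0.0, 0.5))], 200, (1.0, 1.0), rng)
```

A single diagonal matrix does not satisfy assumptions A1 and A2 in any interesting way; it is used here only because the answer $\log 2$ is known exactly.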

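We shall use the following variant of (4.8), which involves the action of a
random product on $\mathbb{RP}^{d-1}$.

Proposition 4.2. (Furstenberg [6]; see [3, Corollary III.3.4.(iii)].) Under
assumption A2 one has that

$$\lim_{n \to \infty} \frac{1}{n}\, \mathbb{E}\{\log \|hx\| \mid h \in S^n\} = \gamma_1 \tag{4.10}$$

uniformly for $x \in \mathbb{RP}^{d-1}$. Consequently,

$$\lim_{n \to \infty} \sup_{x \neq 0} \left| \frac{1}{n} \int_G \log \frac{\|Mx\|}{\|x\|} \, d\mu^n(M) - \gamma_1 \right| = 0. \tag{4.11}$$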




Following [3, p. 55] we use the natural angular distance

$$\delta(x, y) = \frac{\|x \wedge y\|}{\|x\| \, \|y\|} = \sqrt{1 - \frac{\langle x, y \rangle^2}{\|x\|^2 \|y\|^2}} \tag{4.12}$$

which is a metric on $\mathbb{RP}^{d-1}$. It satisfies the following estimate:

Proposition 4.3. (See [3, Proposition III.6.4(ii)].) For any $x, y \in \mathbb{RP}^{d-1}$,

$$\limsup_{n \to \infty} \frac{1}{n} \int_G \log \left( \frac{\delta(Mx, My)}{\delta(x, y)} \right) d\mu^n(M) \le \gamma_2 - \gamma_1 < 0. \tag{4.13}$$

Proof. By (4.12),

$$\begin{aligned}
\frac{\delta(Mx, My)}{\delta(x, y)} &= \frac{\|M(x \wedge y)\| \, \|x\| \, \|y\|}{\|x \wedge y\| \, \|Mx\| \, \|My\|} \le \|\wedge^2 M\| \, \frac{\|x\| \, \|y\|}{\|Mx\| \, \|My\|}, \\
\frac{1}{n} \log \frac{\delta(Mx, My)}{\delta(x, y)} &\le \frac{1}{n} \log \|\wedge^2 M\| - \frac{1}{n} \log \frac{\|Mx\|}{\|x\|} - \frac{1}{n} \log \frac{\|My\|}{\|y\|}.
\end{aligned} \tag{4.14}$$

The proposition follows by integrating this inequality over $M$, and appealing
to (4.8) and (4.10). □


4.3 Cocycle integrals

We have just seen that the integrand

$$s(M, (x, y)) = \log \frac{\delta(Mx, My)}{\delta(x, y)} \tag{4.15}$$

in (4.13) tends to be negative on $S^n$. Our next goal is to show that the
integral of an exponential of it is accordingly smaller than 1. Writing $z$ as
shorthand for $(x, y)$, define

$$S(n) = \sup_z \int_G e^{\alpha s(M, z)} \, d\mu^n(M), \tag{4.16}$$

which exists for any $\alpha > 0$ since $S$ is finite. It is proven in [3, p. 104] that

$$S(n + m) \le S(n)\, S(m), \tag{4.17}$$
using the cocycle identity

$$s(g_1 g_2, v) = s(g_1, g_2 v) + s(g_2, v) \tag{4.18}$$

and a simple change of variables. According to [3, Lemma III.5.4], any matrix
$M \in G$ satisfies the inequality

$$\log \|\wedge^2 M\| \le 2\, \ell(M), \tag{4.19}$$

where

$$\ell(M) = \max\{\log \|M\|, \log \|M^{-1}\|, 0\}. \tag{4.20}$$

It follows from (4.14) that

$$s(M, z) \le \log \|\wedge^2 M\| + 2 \log \|M^{-1}\| \le 4\, \ell(M). \tag{4.21}$$

If $\ell_{\max}$ denotes $\max\{\ell(g) \mid g \in S\}$, then

$$s(M, z) \le 4 n\, \ell_{\max} \tag{4.22}$$

on $S^n$ = the support of $\mu^n$, independently of $z$.


Proposition 4.4. (See [13, Theorem 1] and [3, Proposition V.2.3].) For
$\alpha > 0$ sufficiently small, there exist $n_0 > 0$ and $\rho < 1$ such that

$$\int_G \left( \frac{\delta(Mx, My)}{\delta(x, y)} \right)^{\alpha} d\mu^n(M) \le \rho^n \tag{4.23}$$

for all $x \neq y \in \mathbb{RP}^{d-1}$ and $n \ge n_0$.


Proof. The inequality

$$e^x \le 1 + x + \frac{x^2}{2} e^{|x|}$$

and (4.22) imply that

$$e^{\alpha s(M, z)} \le 1 + \alpha\, s(M, z) + 8 \alpha^2 n^2 \ell_{\max}^2\, e^{4 n \alpha \ell_{\max}} \tag{4.24}$$

for $M \in S^n$. Thus the lefthand side of (4.23), which is the integral of
$e^{\alpha s(M, z)} \, d\mu^n(M)$ over $G$, is bounded by

$$1 + \alpha \int_G s(M, z)\, d\mu^n(M) + 8 \alpha^2 n^2 \ell_{\max}^2\, e^{4 n \alpha \ell_{\max}}. \tag{4.25}$$

Proposition 4.3 asserts that for any $\varepsilon > 0$ there exists $n'$ sufficiently large so
that

$$\sup_z \int_G s(M, z)\, d\mu^n(M) \le n (\gamma_2 - \gamma_1 + \varepsilon) \tag{4.26}$$

for all $n \ge n'$, and so

$$S(n) \le 1 + n \alpha (\gamma_2 - \gamma_1 + \varepsilon) + 8 \alpha^2 n^2 \ell_{\max}^2\, e^{4 n \alpha \ell_{\max}} \tag{4.27}$$

for such $n$. In particular, if $\varepsilon$ and $\alpha$ are sufficiently small, the righthand side of
(4.26) is negative and $S(n') < 1$. Repeated applications of the submultiplicativity
property (4.17) show that $S(k n' + m) \le S(n')^k S(m)$ for $1 \le m < n'$, which
implies the proposition. □


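4.4 Estimate on $\mu^{j-1} \delta_v (B_{r,i})$

This subsection contains the mathematical core of the argument, a Hölder
estimate relating the measures $\mu^{j-1} \delta_v$ and $\nu$. For any $\varepsilon > 0$ and closed subset
$U \subset \mathbb{RP}^{d-1}$, define a function $f = f_{\varepsilon, U}$ on $\mathbb{RP}^{d-1}$ by

$$f(x) = \max\left\{ 1 - \frac{\delta(x, U)}{\varepsilon},\ 0 \right\}. \tag{4.28}$$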
Proposition 4.5. For $0 < \alpha < 1$ the function $f$ satisfies the bound

$$\frac{|f(x) - f(y)|}{\delta(x, y)^{\alpha}} \le \varepsilon^{-\alpha} \tag{4.29}$$

uniformly in $x, y \in \mathbb{RP}^{d-1}$.


Note: the expression on the lefthand side of (4.29) appears in [T3J p. 106],
where it is used to create a Banach space norm.


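Proof. The result is immediate if either $x$ and $y$ are both in $U$, or both
at distance at least $\varepsilon$ from $U$; likewise it is immediate if one of
them lies in $U$ and the other lies at distance at least $\varepsilon$ from $U$.
We may therefore assume, without loss of generality, that
$0 < \delta(x,U) < \varepsilon$.

If $y \in U$, then $\delta(x,y) \ge \delta(x,U)$ and the quotient is at most
$\varepsilon^{-1}\delta(x,U)^{1-\alpha} \le \varepsilon^{-\alpha}$. If
$0 < \delta(y,U) < \varepsilon$, then

$$\frac{|f(x) - f(y)|}{\delta(x,y)^{\alpha}} = \varepsilon^{-1}\frac{|\delta(x,U) - \delta(y,U)|}{\delta(x,y)^{\alpha}} \le \varepsilon^{-1}|\delta(x,U) - \delta(y,U)|^{1-\alpha} \le \varepsilon^{-\alpha},$$

using the inequality

$$|\delta(x,U) - \delta(y,U)| \le \delta(x,y). \tag{4.30}$$

For the remaining case $\delta(y,U) \ge \varepsilon$ we again use (4.30) to deduce

$$\frac{|f(x) - f(y)|}{\delta(x,y)^{\alpha}} = \varepsilon^{-1}\frac{|\varepsilon - \delta(x,U)|}{\delta(x,y)^{\alpha}} \le \varepsilon^{-1}|\varepsilon - \delta(x,U)|^{1-\alpha} \le \varepsilon^{-\alpha}.$$ □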
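The Hölder bound of Proposition 4.5 can be probed numerically on the projective
line. The sketch below assumes that $\delta$ is the sine of the angle between
two lines (so $\delta(a,b) = |\sin(a-b)|$ for angles on $\mathbb{RP}^1$); the
set `U`, the values of `eps` and `alpha`, and the grid size are hypothetical
choices made only for illustration.

```python
import math

def delta(a: float, b: float) -> float:
    """Assumed distance on RP^1 between the lines at angles a and b."""
    return abs(math.sin(a - b))

def dist_to_set(a: float, U: list) -> float:
    return min(delta(a, u) for u in U)

def f(a: float, U: list, eps: float) -> float:
    """Cutoff function of (4.28): equals 1 on U and 0 at distance >= eps from U."""
    return max(1.0 - dist_to_set(a, U) / eps, 0.0)

def max_holder_quotient(U, eps, alpha, steps=300):
    """Largest value of |f(x) - f(y)| / delta(x, y)^alpha over a grid of angles."""
    angles = [math.pi * k / steps for k in range(steps)]
    values = [f(a, U, eps) for a in angles]
    worst = 0.0
    for i in range(steps):
        for j in range(i + 1, steps):
            d = delta(angles[i], angles[j])
            if d > 0.0:
                worst = max(worst, abs(values[i] - values[j]) / d ** alpha)
    return worst

U = [0.3, 1.1, 1.15]          # a finite (hence closed) set of directions
eps, alpha = 0.2, 0.4
worst = max_holder_quotient(U, eps, alpha)
print(worst, eps ** (-alpha))
```

A grid search of this kind cannot prove the bound; it only confirms that no
sampled pair of points violates $\varepsilon^{-\alpha}$.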


Proposition 4.6. (See [3, p. 107].) Consider the function $f$ defined in terms
of the set $U$ and constant $\varepsilon > 0$ in (4.28). For $\alpha$
sufficiently small, there exist $n_0 > 0$ and $\rho < 1$ such that

$$\int_G f(Mv)\,d\mu^n(M) - \int_{\mathbb{RP}^{d-1}} f(y)\,d\nu(y) \le \varepsilon^{-\alpha}\rho^n \tag{4.31}$$

for all $n > n_0$.


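Proof. In fact, the present argument shows this inequality holds when the
lefthand side of (4.31) is replaced by its absolute value, though we shall not
need this. After replacing $\nu$ by $\mu^n * \nu = \nu$ in the second integral,
the lefthand side equals

$$\begin{aligned}
\int_G f(Mv)\,d\mu^n(M) &- \int_{\mathbb{RP}^{d-1}}\int_G f(My)\,d\mu^n(M)\,d\nu(y) \\
&= \int_{\mathbb{RP}^{d-1}}\int_G \big(f(Mv) - f(My)\big)\,d\mu^n(M)\,d\nu(y) \\
&\le \int_{\mathbb{RP}^{d-1}}\int_G \sup_{x \neq y}\frac{|f(x) - f(y)|}{\delta(x,y)^{\alpha}}\ \delta(Mv,My)^{\alpha}\,d\mu^n(M)\,d\nu(y) \\
&\le \sup_{x \neq y}\frac{|f(x) - f(y)|}{\delta(x,y)^{\alpha}} \int_{\mathbb{RP}^{d-1}}\int_G \frac{\delta(Mv,My)^{\alpha}}{\delta(v,y)^{\alpha}}\,d\mu^n(M)\,d\nu(y),
\end{aligned} \tag{4.32}$$

the last inequality holding because $\delta(\cdot,\cdot) \le 1$. The result now
follows from Propositions 4.4 and 4.5. □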

We will eventually apply this to sets containing the $B_{r,i}$ from (4.6), which
are all of the form

$$\{x \in \mathbb{RP}^{d-1} \mid \|Ax\| \le \|x\|\} \tag{4.33}$$

for some $A \in GL(d,\mathbb{R})$ of norm greater than 1. Given such a matrix
$A$, let $w = w_A \in \mathbb{R}^d$ be a unit vector such that $\|Aw\| = \|A\|$.


Proposition 4.7. $\|Ax\| \le \|x\| \implies \|\langle x,w\rangle Aw\| \le \|x\|$.
Proof. Let $z$ be a vector perpendicular to $w$. For all $t \in \mathbb{R}$ we
have that

$$\|Aw\|^2 \ge \frac{\|A(w + tz)\|^2}{\|w + tz\|^2} = \frac{\|Aw\|^2 + 2t\langle Aw, Az\rangle + t^2\|Az\|^2}{1 + t^2\|z\|^2}, \tag{4.34}$$

and so this last expression must have a local maximum at $t = 0$. In particular,
its $t$-derivative at $t = 0$ must vanish, i.e., $\langle Aw, Az\rangle = 0$.
Therefore if a vector $x \in \mathbb{R}^d$ is decomposed as
$x = \langle x,w\rangle w + z$ for some $z \perp w$, then
$Ax = A\langle x,w\rangle w + Az$ is again an orthogonal decomposition. It
follows that $\|Ax\| \ge \|A\langle x,w\rangle w\|$, proving the proposition. □
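The two facts used in this proof, $\langle Aw, Az\rangle = 0$ for $z \perp w$
and $\|Ax\| \ge |\langle x,w\rangle|\,\|Aw\|$, are easy to test in dimension 2.
The sketch below is an illustration only: it restricts to $2 \times 2$ matrices
so that the maximizing direction $w$ has a closed form, and the random matrices,
seed, and tolerances are arbitrary assumptions.

```python
import math
import random

def top_singular_direction(A):
    """Unit vector w maximizing ||A w|| for a 2x2 matrix A (list of rows)."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    theta = 0.5 * math.atan2(2.0 * q, p - r)   # principal-axis angle
    return (math.cos(theta), math.sin(theta))

def apply(A, x):
    return (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def norm(x):
    return math.hypot(x[0], x[1])

random.seed(0)
ok = True
for _ in range(200):
    A = [[random.uniform(-5, 5) for _ in range(2)] for _ in range(2)]
    w = top_singular_direction(A)
    z = (-w[1], w[0])                          # unit vector perpendicular to w
    Aw, Az = apply(A, w), apply(A, z)
    ok = ok and abs(dot(Aw, Az)) < 1e-9        # images stay orthogonal
    for _ in range(20):
        x = (random.uniform(-1, 1), random.uniform(-1, 1))
        ok = ok and norm(apply(A, x)) + 1e-9 >= abs(dot(x, w)) * norm(Aw)
print(ok)
```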


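We now return to bounding $\mu^{j-1} * \delta_v(B_{r,i})$ in order to get an
error estimate in (4.7). The sets $B_{r,i}$ are of the form (4.33), with
$A = g_r^{-1} g_i$. We now fix $r$ and $i$. By Proposition 4.7,

$$B_{r,i} \subset U = \left\{x \in \mathbb{RP}^{d-1} \ \middle|\ \frac{|\langle x,w\rangle|}{\|x\|} \le \|A\|^{-1}\right\}. \tag{4.35}$$

Proposition 4.6 now shows that

$$\mu^{j-1} * \delta_v(B_{r,i}) \le \mu^{j-1} * \delta_v(U) \le \int_{\mathbb{RP}^{d-1}} f(y)\,d\nu(y) + \varepsilon^{-\alpha}\rho^{j-1}, \tag{4.36}$$

where $\varepsilon > 0$ is arbitrary and $f = f_{\varepsilon,U}$ is the function
(4.28). The last integral is bounded by $\nu(U')$, where

$$\begin{aligned}
U' &= \{x \in \mathbb{RP}^{d-1} \mid \delta(x,U) < \varepsilon\} \\
   &= \left\{x \in \mathbb{RP}^{d-1} \ \middle|\ \exists\, y \text{ with } \delta(x,y) < \varepsilon \text{ and } \frac{|\langle y,w\rangle|}{\|y\|} \le \|A\|^{-1}\right\}.
\end{aligned} \tag{4.37}$$

Here $w$, as above, represents a unit vector such that $\|Aw\| = \|A\|$. Using
(4.12), this last condition on $|\langle y,w\rangle|$ can be restated as
$\delta(y,w) \ge \sqrt{1 - \|A\|^{-2}}$. $U'$ is in turn contained in the set

$$\begin{aligned}
U'' &= \left\{x \in \mathbb{RP}^{d-1} \ \middle|\ \delta(x,w) > \sqrt{1 - \|A\|^{-2}} - \varepsilon\right\} \\
    &= \left\{x \in \mathbb{RP}^{d-1} \ \middle|\ \frac{|\langle x,w\rangle|}{\|x\|} < \sqrt{1 - \left(\sqrt{1 - \|A\|^{-2}} - \varepsilon\right)^2}\right\}
\end{aligned} \tag{4.38}$$

by the triangle inequality.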

We now quote a result of Guivarc'h and Raugi (see [3, Theorem VI.2.1])
which immediately implies a bound on the $\nu$-measure of $U''$ through the
Chebyshev inequality. The comments in the proof of this Theorem on
[3, p. 156] indicate that the exponent $\alpha$ has the same source as the one
in Proposition 4.4 above, and thus may be taken to have the same value.

Theorem 4.8. (Guivarc'h and Raugi) Under assumptions A1 and A2 there
exist constants $\alpha > 0$ and $K > 0$ such that

$$\int_{\mathbb{RP}^{d-1}} \left(\frac{\|x\|\,\|y\|}{|\langle x,y\rangle|}\right)^{\alpha} d\nu(x) \le K \tag{4.39}$$

uniformly in $y$.

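Applying the Chebyshev inequality to this with $y = w$, one gets

$$\int_{\mathbb{RP}^{d-1}} f(y)\,d\nu(y) \le \nu(U'') \le K\left(1 - \left(\sqrt{1 - \|A\|^{-2}} - \varepsilon\right)^2\right)^{\alpha/2}. \tag{4.40}$$

Therefore, using (4.36) and assumption A3, we have bounded the probability
$\mathrm{Prob}_{\mathrm{Error}}$ from (4.7) by

$$\mathrm{Prob}_{\mathrm{Error}} \le \frac{1}{k} \sum_{\substack{t < j \le L \\ 1 \le r \neq i \le k}} \left( K\left(1 - \left(\sqrt{1 - N^{-2}} - \varepsilon\right)^2\right)^{\alpha/2} + \varepsilon^{-\alpha}\rho^{j-1} \right). \tag{4.41}$$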


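The expression inside the large parentheses is

$$N^{-2} - \varepsilon^2 + 2\varepsilon\sqrt{1 - N^{-2}} \le N^{-2} + 2\varepsilon.$$

We now specify $\varepsilon$ to be $\tfrac{3}{2}N^{-2}$, so that the error is
bounded by

$$\mathrm{Prob}_{\mathrm{Error}} \le \frac{1}{k} \sum_{\substack{t < j \le L \\ 1 \le r \neq i \le k}} \left( K 2^{\alpha} N^{-\alpha} + \varepsilon^{-\alpha}\rho^{j-1} \right). \tag{4.42}$$

Take $t = \left[1 + \dfrac{\log\left(K 3^{\alpha} N^{-3\alpha}\right)}{\log \rho}\right]$, so that

$$K 2^{\alpha} N^{-\alpha} \ge \varepsilon^{-\alpha}\rho^{j-1} \quad \text{for } j > t \tag{4.43}$$

and

$$\mathrm{Prob}_{\mathrm{Error}} \le \frac{1}{k}\,(L - t)\,k(k-1) \cdot 2^{\alpha+1} K N^{-\alpha}. \tag{4.44}$$

This completes the proof of Theorem 4.1.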
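The algebra behind the last steps can be checked numerically. In the sketch
below the value `eps = 1.5 / N**2` is an assumption, chosen because it makes
$N^{-2} + 2\varepsilon$ equal to $4N^{-2}$ and hence turns the first summand of
(4.41) into at most $K 2^{\alpha} N^{-\alpha}$; the sample values of `N` and
`alpha` are illustrative.

```python
import math

def parenthesized(N: float, eps: float) -> float:
    """1 - (sqrt(1 - N^-2) - eps)^2, the quantity raised to alpha/2 in (4.41)."""
    return 1.0 - (math.sqrt(1.0 - N ** -2) - eps) ** 2

def expanded(N: float, eps: float) -> float:
    """The same quantity after expanding the square."""
    return N ** -2 - eps ** 2 + 2.0 * eps * math.sqrt(1.0 - N ** -2)

alpha = 0.4
checks = []
for N in (2.0, 10.0, 100.0, 12157.1):
    eps = 1.5 * N ** -2      # assumed choice of epsilon (see lead-in)
    same = math.isclose(parenthesized(N, eps), expanded(N, eps),
                        rel_tol=1e-6, abs_tol=1e-15)
    below = expanded(N, eps) <= N ** -2 + 2.0 * eps
    # With this eps, (N^-2 + 2 eps)^(alpha/2) equals 2^alpha * N^-alpha.
    power = math.isclose((N ** -2 + 2.0 * eps) ** (alpha / 2.0),
                         2.0 ** alpha * N ** -alpha)
    checks.append(same and below and power)
print(checks)
```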
4.5 Numerical Examples

Example 1: where Norm Reduction works well 

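We now present an example of the algorithm in practice, for dimension $d = 3$
and the generating set $S = \{g_1, g_2, g_3\}$, where

$$g_1 = \begin{pmatrix} -9 & -59 & 30 \\ 11 & 66 & -32 \\ 3 & 21 & -11 \end{pmatrix}, \quad
g_2 = \begin{pmatrix} 444 & -31 & -363 \\ -110 & 7 & 90 \\ -1271 & 90 & 1039 \end{pmatrix}, \quad \text{and} \quad
g_3 = \begin{pmatrix} 9 & 31 & 33 \\ -91 & -303 & -310 \\ -35 & -116 & -118 \end{pmatrix}. \tag{4.45}$$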

These matrices were chosen randomly among those with integral entries in a
bounded range. In all our tests we ran the algorithm with the parameter $t = 0$,
i.e., not allowing for brute force search for the final steps. The parameter
$N$ in this example is $\approx 12157.1$. We ran several numerical trials of
the Norm Reduction Algorithm (3.4) on the Word Problem on Vectors (3.3) with the
vector $v = (1,0,0)$, almost all of which were successful (see Table 1).

      L     Number of Attempts   Number of Successes
      2           10,000               10,000
     10           10,000                9,998
     50           10,000                9,978
    100           10,000                9,963
    200           10,000                9,936
  1,000            1,000                1,000

Table 1: Numerical results with generating set $S$ from (4.45).
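A trial of this kind can be sketched in a few lines of exact integer
arithmetic. The Norm Reduction Algorithm (3.4) is not restated in this section,
so the greedy rule below (at each step, undo the generator whose inverse gives
the smallest norm) is an assumption about its general shape, not a
transcription of it. The function names, the seed, and the word length are
hypothetical.

```python
import random

GENS = {
    1: [[-9, -59, 30], [11, 66, -32], [3, 21, -11]],
    2: [[444, -31, -363], [-110, 7, 90], [-1271, 90, 1039]],
    3: [[9, 31, 33], [-91, -303, -310], [-35, -116, -118]],
}
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def adjugate(m):
    """Transpose of the cofactor matrix; equals the inverse when det(m) = 1."""
    def minor(i, j):
        sub = [[m[r][c] for c in range(3) if c != j] for r in range(3) if r != i]
        return sub[0][0] * sub[1][1] - sub[0][1] * sub[1][0]
    return [[(-1) ** (i + j) * minor(j, i) for j in range(3)] for i in range(3)]

def det3(m):
    adj = adjugate(m)
    return sum(m[0][j] * adj[j][0] for j in range(3))

INVS = {r: adjugate(g) for r, g in GENS.items()}

def norm_sq(v):
    return sum(x * x for x in v)

def peel(target, length):
    """Greedy sketch: repeatedly apply the inverse generator minimizing the norm."""
    word, x = [], list(target)
    for _ in range(length):
        r = min(INVS, key=lambda s: norm_sq(matvec(INVS[s], x)))
        word.append(r)
        x = matvec(INVS[r], x)
    return word, x

random.seed(0)
secret = [random.choice([1, 2, 3]) for _ in range(10)]   # hidden word, L = 10
target = [1, 0, 0]                                       # v = (1, 0, 0)
for r in reversed(secret):
    target = matvec(GENS[r], target)                     # g_{i_1} ... g_{i_L} v
recovered, remainder = peel(target, len(secret))
print(recovered == secret, remainder)
```

Because all three generators have determinant 1, their inverses are integer
matrices and every norm comparison is exact; no floating-point error enters
the trial.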

The error term (4.44) is bounded by the one given in Theorem 4.1 if $C$
is taken to be $4K$. In this typical example, the invariant measure $\nu$ and
its approximations $\mu^j * \delta_v$ are supported near the eigenvectors for
the $g_i$ corresponding to their maximal eigenvalue. Recall that the constant
$K$ comes from the measure of the set $U''$, which in (4.38) is related to
points in $\mathbb{RP}^2$ which have $\delta$-distance very close to 1 from the
direction of maximal stretching of the six matrices $g_r^{-1} g_i$. We computed
that these 18 pairs of $\delta$-distances range between .33 and .98, far from 1
on the scale of $1/N$. Since $C$ can be large only if these distances are much
closer to 1, we concluded that $C$ is small; under some heuristics, we computed
its value to be below 7.

To estimate the value of $\alpha$, we recall that its origin in Proposition 4.4
comes from bounds on the quantities $S(n)$ (4.16). We numerically estimated that
$S(2) < .83$ for $\alpha = 0.4$. This was done by approximating that maximum
using a mesh. While that is no guarantee of an accurate estimate for the
maximum, it is worth noting that the values to be maximized were typically
much smaller. Also, using $S(n)$ for larger values of $n$ would result in a
better estimate for $\alpha$. With this value of $\alpha = .4$, the probability
in Theorem 4.1 is less than 1 only for small values of $L - t$. However, that
estimate is certainly an overestimate for other reasons: for one thing, the
proof estimates the error probability at each step, and multiplies this
individual estimate by the number of steps to obtain the final estimate. The
actual error probability is likely to be far smaller. The combination of this
potential to improve the estimates, along with the excellent performance of the
Norm Reduction Algorithm (3.4) in practice, demonstrates its usefulness in
attacking the Word Problem on Vectors (3.3).

Example 2: where Norm Reduction does not work well 

The algorithm does not perform well when one of the generators is orthogo¬ 
nal. In this example we take S' = {g[, # 2 , < 73 }, where g[ = ^0 1 0 j and 5 % g 3 

are as defined in (4.45). With this one change (but otherwise the same
conditions as in Example 1) the outcomes were much worse, and are summarized
in Table 2.


      L     Number of Attempts   Number of Successes
      2           10,000               10,000
     10           10,000                4,404
     50           10,000                   86
    100           10,000                    2
    200           10,000                    0
  1,000            1,000                    0

Table 2: Numerical results with generating set $S'$.


5 Rounding and the Traveling Salesman Problem

In this section we show how algorithms to solve the Closest Group Element
Problem (3.5) can be easily converted to solve the Traveling Salesman Problem
(tsp), and in particular prove Theorem 3.1.

Definition 5.1. Traveling Salesman Problem (on graphs). Given a complete
graph on $n$ vertices whose edges have positive integer weights, find a
Hamiltonian cycle which has minimal total weight (i.e., sum of its edge weights).
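For very small instances the problem in Definition 5.1 can be solved by
exhaustive search, which is useful as a reference when testing a reduction.
The following is a minimal sketch; the weight matrix `W` is a made-up example,
vertices are numbered from 0, and the search is exponential in $n$.

```python
from itertools import permutations

def cycle_weight(order, w):
    """Total weight of the closed tour visiting `order` and returning to its start."""
    return sum(w[order[k]][order[(k + 1) % len(order)]] for k in range(len(order)))

def brute_force_tsp(w):
    """Return (minimal weight, tour) over all Hamiltonian cycles through vertex 0."""
    n = len(w)
    best = None
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        weight = cycle_weight(order, w)
        if best is None or weight < best[0]:
            best = (weight, order)
    return best

# Symmetric positive integer weights on the complete graph with 4 vertices.
W = [
    [0, 1, 4, 2],
    [1, 0, 3, 5],
    [4, 3, 0, 1],
    [2, 5, 1, 0],
]
print(brute_force_tsp(W))   # (7, (0, 1, 2, 3))
```

The weights here do not obey the triangle inequality ($w_{13} = 5 > w_{10} +
w_{03} = 3$), which the definition allows.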

The above formulation is more general than the metric TSP problem, in
that the edge weights do not need to obey the triangle inequality. The
TSP problem is NP-hard, as is the simpler problem of finding a Hamiltonian
cycle whose total weight is within a constant factor of the minimum
[25, Theorem 3.6].

We shall now describe how to convert any instance of tsp into a Closest
Group Element Problem (3.5). First we set some notation for the tsp
problem. Let $w_e = w_{ij} = w_{ji}$ be the weight of the directed edge
$e = (i,j)$ connecting the $i$-th and $j$-th vertices. Let $m$ be an a priori
lower bound for the total weight of the shortest Hamiltonian cycle (for
example, $m$ can be $n$ times the lowest edge weight), and $M$ be an upper
bound (for example, the weight of any Hamiltonian cycle). Let $m_0$ denote the
minimal total weight, which is unknown (and hence which we do not use in
setting parameters). Since the weights are positive integers, one may of
course assume that $m_0, m \ge 1$. The edge weight unit can be rescaled without
affecting the solution to the TSP problem: accordingly we shall replace the
above parameters by $mT$, $MT$, and $m_0T$, where $T > 0$ is a parameter that
will be chosen later. After this rescaling, one has that

    any cycle weight less than $m_0T + T$ is minimal.        (5.1)


In particular, there is no loss of generality in assuming that $M \ge m_0 + 1$.
Given an edge $e = (i,j)$, let $v_e$ denote the row vector of length $n$ which
has all zeroes except for $K$'s in positions $i$ and $j$, where $K$ is a
parameter that will be chosen later. Let $E_{ij}$ denote the $n \times n$ matrix
which has all 0 entries except a 1 in the $(i,j)$-th position. Let
$\beta > \alpha > 0$ be parameters (to be specified later), and
$M_e = M_{ij} = \alpha I + \beta E_{ij}$. We set $d = 2n + 3$ and define
$d \times d$ matrices for each directed edge by

$$g_e = g_{ij} = \begin{pmatrix}
M_e & & & & \\
 & 1 & v_e & & \\
 & & I_{n \times n} & & \\
 & & & 1 & w_e \\
 & & & & 1
\end{pmatrix} \tag{5.2}$$

(the blocks in this matrix are of sizes $n$, 1, $n$, 1, and 1, respectively; we
have as well used the convention that blank entries are zero). Note that
$E_{ij} \neq E_{ji}$, and consequently $M_{ij} \neq M_{ji}$ and
$g_{ij} \neq g_{ji}$. The Zariski closure of the group (or semigroup) generated
by $\{g_{ij} \mid i \neq j\}$ contains $GL_n(\mathbb{R})$, embedded into the
$n \times n$ block in the upper left corner, and satisfies the large
dimensionality constraint of Section 1.

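If $h_1, \ldots, h_\ell$ are all square matrices of the same size, let
$\prod_{i \le \ell} h_i$ denote the product $h_1 \cdots h_\ell$. If
$e_1, \ldots, e_\ell$ are edges, then

$$g_{e_1} g_{e_2} \cdots g_{e_\ell} = \begin{pmatrix}
\prod_{r \le \ell} M_{e_r} & & & & \\
 & 1 & \sum_{r \le \ell} v_{e_r} & & \\
 & & I_{n \times n} & & \\
 & & & 1 & \sum_{r \le \ell} w_{e_r} \\
 & & & & 1
\end{pmatrix}. \tag{5.3}$$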
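The product formula (5.3) can be confirmed by multiplying out a few of these
matrices. The sketch below reads the block layout of $g_e$ off (5.3) itself
(a diagonal block $M_e$, a block carrying $v_e$ above an identity, and a
$2 \times 2$ block carrying $w_e$); that layout, the 0-based vertex numbering,
and the sample parameters are assumptions made for the illustration.

```python
def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def m_block(n, i, j, alpha, beta):
    """M_e = alpha*I + beta*E_ij for the directed edge e = (i, j), 0-indexed."""
    m = [[alpha if r == c else 0 for c in range(n)] for r in range(n)]
    m[i][j] += beta
    return m

def edge_matrix(n, i, j, weight, alpha, beta, K):
    """Assumed (2n+3) x (2n+3) block matrix attached to the edge (i, j)."""
    d = 2 * n + 3
    g = [[0] * d for _ in range(d)]
    for r, row in enumerate(m_block(n, i, j, alpha, beta)):
        g[r][:n] = row                      # block of size n: M_e
    g[n][n] = 1                             # block of size 1
    g[n][n + 1 + i] = K                     # row vector v_e: K in positions i, j
    g[n][n + 1 + j] = K
    for r in range(n):
        g[n + 1 + r][n + 1 + r] = 1         # block of size n: identity
    g[2 * n + 1][2 * n + 1] = 1             # last two blocks: [[1, w_e], [0, 1]]
    g[2 * n + 1][2 * n + 2] = weight
    g[2 * n + 2][2 * n + 2] = 1
    return g

n, alpha, beta, K = 3, 1, 10, 100
tour = [((0, 1), 5), ((1, 2), 7), ((2, 0), 11)]   # (edge, weight) pairs

word, m_prod = None, None
for (i, j), w in tour:
    g = edge_matrix(n, i, j, w, alpha, beta, K)
    m = m_block(n, i, j, alpha, beta)
    word = g if word is None else matmul(word, g)
    m_prod = m if m_prod is None else matmul(m_prod, m)

top_left = [row[:n] for row in word[:n]]
v_sum = word[n][n + 1:2 * n + 1]
w_sum = word[2 * n + 1][2 * n + 2]
print(top_left == m_prod, v_sum, w_sum)   # True [200, 200, 200] 23
```

The vector block equals $[2K \cdots 2K]$ here because the three edges touch
each vertex exactly twice.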

We shall now see how features of this matrix are related to the total weights
of Hamiltonian cycles. First of all, $\sum_{r \le \ell} v_{e_r}$ equals
$[2K \cdots 2K]$ (i.e., a vector of all $2K$'s) if and only if the edges
$e_1, \ldots, e_\ell$ touch each vertex exactly twice. The entry
$\sum_{r \le \ell} w_{e_r}$ is of course the total weight of the path, if indeed
$e_1, \ldots, e_\ell$ trace out a path. The product

$$\prod_{r \le \ell} M_{e_r} = \prod_{r \le \ell} (\alpha I + \beta E_{i_r j_r}) = \sum_{(\varepsilon_1, \ldots, \varepsilon_\ell) \in \{0,1\}^\ell} \alpha^{\ell - (\varepsilon_1 + \cdots + \varepsilon_\ell)} \beta^{\varepsilon_1 + \cdots + \varepsilon_\ell} \prod_{r \le \ell} E_{i_r j_r}^{\varepsilon_r} \tag{5.4}$$

helps detect such a path. The last product is zero unless the edges $e_r$ for
which $\varepsilon_r = 1$ trace out a connected path; if they do, the product
equals $E_{ij}$, where $i$ is the first value of $i_r$ for which
$\varepsilon_r = 1$ and $j$ is the last value of $j_r$ for which
$\varepsilon_r = 1$. Note that if $\alpha = 0$, the only nonzero term is the one
for $\varepsilon_1 = \varepsilon_2 = \cdots = \varepsilon_\ell = 1$: then the
product $\prod_{r \le \ell} M_{e_r} = \beta^\ell E_{i_1 j_\ell}$ if the edges
$e_1, e_2, \ldots, e_\ell$ trace out a connected path, but is zero otherwise.
Thus in the extreme case $\alpha = 0$, tracing out a connected path is
equivalent to the nonvanishing of this product. Unfortunately, however, the
matrices are only invertible if $\alpha > 0$. We will mainly be concerned with
the case of $\alpha > 0$ because of its relevance to the Closest Group Element
Problem (3.5), but include some comments about the $\alpha = 0$ case as well.
In fact, the extra parameters $\alpha$ and $\beta$ are needed simply to adapt
features of the simpler $\alpha = 0$ case to invertible matrices.


Proposition 5.2. The $(i,j)$-th entry of $\prod_{r \le \ell} M_{e_r}$ satisfies
the bound

$$\left| \Big( \prod_{r \le \ell} M_{e_r} \Big)_{ij} \right| \le \alpha\beta^{\ell-1} 2^\ell \tag{5.5}$$

if the edges $e_1, e_2, \ldots, e_\ell$ do not trace out a path, and

$$\left| \Big( \prod_{r \le \ell} M_{e_r} \Big)_{ij} - \beta^\ell\, \delta_{i = i_1} \delta_{j = j_\ell} \right| \le \alpha\beta^{\ell-1} 2^\ell \tag{5.6}$$

if they do.

Proof. In either case, the expressions to be bounded are the matrix entries
of the sum on the righthand side of (5.4), except for the term corresponding
to $\varepsilon_1 = \varepsilon_2 = \cdots = \varepsilon_\ell = 1$ (which only
comes up in (5.6) anyhow). The matrix entries of a product of $E_{i_r j_r}$ are
all $\le 1$, so the sum is bounded by
$(\alpha + \beta)^\ell - \beta^\ell \le \alpha\beta^{\ell-1} 2^\ell$. □
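For small parameters the bounds (5.5) and (5.6) can be checked exhaustively.
The sketch below enumerates every word of a given length over the directed
edges of a small complete graph; the function names, the 0-based indexing, and
the integer values of `alpha` and `beta` (with `beta > alpha > 0`) are
illustrative assumptions.

```python
import itertools

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def m_block(n, i, j, alpha, beta):
    """M_e = alpha*I + beta*E_ij for the directed edge e = (i, j)."""
    m = [[alpha if r == c else 0 for c in range(n)] for r in range(n)]
    m[i][j] += beta
    return m

def traces_path(word):
    """True when each edge starts at the vertex where the previous one ended."""
    return all(word[r][1] == word[r + 1][0] for r in range(len(word) - 1))

def bounds_hold(n, length, alpha, beta):
    """Check (5.5) and (5.6) for every word of the given length."""
    edges = [(i, j) for i in range(n) for j in range(n) if i != j]
    bound = alpha * beta ** (length - 1) * 2 ** length
    for word in itertools.product(edges, repeat=length):
        i0, j0 = word[0]
        prod = m_block(n, i0, j0, alpha, beta)
        for (i, j) in word[1:]:
            prod = matmul(prod, m_block(n, i, j, alpha, beta))
        is_path = traces_path(word)
        start, end = word[0][0], word[-1][1]
        for i in range(n):
            for j in range(n):
                # Leading term beta^l at position (i_1, j_l), present only for paths.
                main = beta ** length if is_path and (i, j) == (start, end) else 0
                if abs(prod[i][j] - main) > bound:
                    return False
    return True

print(bounds_hold(3, 3, 1, 3), bounds_hold(3, 2, 2, 5))   # True True
```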

Let $\varepsilon > 0$ be a parameter (which will be specified later). The
Closest Group Element Problem (3.5) derived from this TSP instance is the
following, assuming $\alpha > 0$ (if $\alpha = 0$, it is the verbatim rounding
problem for semigroups):

    Find the closest product of length $\le n$ of the $g_e$'s to the matrix

$$z = \begin{pmatrix}
\beta^n E_{11} + \varepsilon I_n & & & & \\
 & 1 & 2K \cdots 2K & & \\
 & & I_{n \times n} & & \\
 & & & 1 & 0 \\
 & & & & 1
\end{pmatrix} \tag{5.7}$$

    in the matrix norm $\|\cdot\|$ of (3.6).

The block structure of the matrices allows us to compute the distance of a
product in terms of the features described after (5.3):

$$\Big\| \prod_{r \le \ell} g_{e_r} - z \Big\|^2 = \Big\| \prod_{r \le \ell} M_{e_r} - \beta^n E_{11} - \varepsilon I_n \Big\|^2 + \Big\| \sum_{r \le \ell} v_{e_r} - [2K \cdots 2K] \Big\|^2 + \Big( \sum_{r \le \ell} w_{e_r} \Big)^2, \tag{5.8}$$

where we again stress that $\|\cdot\|$ refers to the norm (3.6) for the rest of
this section.


Proposition 5.3. (Note that $\ell = n$ in parts (A) and (C).)

(A) If the edges $e_1, e_2, \ldots, e_n$ trace out a Hamiltonian cycle starting
and ending at the first vertex, then

$$\Big\| \prod_{r \le n} M_{e_r} - \beta^n E_{11} - \varepsilon I_n \Big\|^2 \le (n\alpha\beta^{n-1} 2^n)^2 + (n\varepsilon)^2 \tag{5.9}$$

and consequently

$$\Big\| \prod_{r \le n} g_{e_r} - z \Big\|^2 \le (n\alpha\beta^{n-1} 2^n)^2 + (n\varepsilon)^2 + (\text{weight of cycle})^2. \tag{5.10}$$

(B) If the edges $e_1, e_2, \ldots, e_\ell$ do not touch each vertex exactly
twice, then

$$\Big\| \prod_{r \le \ell} g_{e_r} - z \Big\|^2 \ge K^2. \tag{5.11}$$

(C) If the edges $e_1, e_2, \ldots, e_n$ do not trace out a path beginning and
ending at vertex 1, then

$$\Big\| \prod_{r \le n} M_{e_r} - \beta^n E_{11} - \varepsilon I_n \Big\|^2 \ge \Big| \Big( \prod_{r \le n} M_{e_r} \Big)_{11} - \beta^n - \varepsilon \Big|^2 \ge (\varepsilon + \beta^n - \alpha\beta^{n-1} 2^n)^2. \tag{5.12}$$

Proof. The inequality (5.9) in part (A) is an immediate consequence of (5.6)
and the triangle inequality. It then implies (5.10) because the middle term
on the righthand side of (5.8) vanishes when the path enters and exits each
vertex exactly once.

On the other hand, failure to touch each vertex exactly twice means one of
the vector entries for the middle term in (5.8) will be at least $K$, showing
that the lefthand side of (5.11) is at least $K^2$ (in fact by parity
considerations it will be at least $2K^2$). This demonstrates part (B). Part
(C) is likewise a consequence of Proposition 5.2. □

Proposition 5.4. Suppose

1. $(n\alpha\beta^{n-1} 2^n)^2 + (n\varepsilon)^2 \le mT^2$

2. $K \ge MT$

3. $\varepsilon + \beta^n - \alpha\beta^{n-1} 2^n \ge MT$.

Then any word of length $\le n$ in the $g_e$ closest to $z$ has the form
$g_{e_1} g_{e_2} \cdots g_{e_n}$, where the edges $e_1, e_2, \ldots, e_n$ trace
out a Hamiltonian cycle of shortest total weight that begins and ends at the
first vertex.
Proof. We shall use all three parts of the previous Proposition. If
$e_1, e_2, \ldots, e_n$ is the shortest Hamiltonian cycle and
$h_1 = g_{e_1} g_{e_2} \cdots g_{e_n}$, then (5.10) from part (A) and property 1
imply

$$\|h_1 - z\|^2 \le (n\alpha\beta^{n-1} 2^n)^2 + (n\varepsilon)^2 + (m_0 T)^2 \le mT^2 + m_0^2 T^2 < T^2 (m_0 + 1)^2, \tag{5.13}$$

because $m \le m_0$.

Part (B) and property 2 imply that a path which does not touch each
vertex exactly twice has

$$M^2 T^2 \le \Big\| \prod_{r \le \ell} g_{e_r} - z \Big\|^2. \tag{5.14}$$

Since we have assumed $M \ge m_0 + 1$, the word $\prod_{r \le \ell} g_{e_r}$
cannot be closest to $z$. In particular, the closest word to $z$ must be a
product of length exactly $n$ (otherwise the edges it is formed from do not
touch each vertex exactly twice). Part (C) and property 3 likewise show that
the edges of the closest word trace out a path beginning and ending at 1.

Thus the closest word comes from a Hamiltonian cycle. We now must
show that it comes from the Hamiltonian cycle of lowest total weight. Indeed,
suppose that $h = \prod_{r \le n} g_{e_r}$ comes from a Hamiltonian cycle and
$\|h - z\| \le \|h_1 - z\|$. By (5.8) and (5.13) we must have

$$(\text{total weight of } h\text{'s path})^2 \le \|h - z\|^2 \le \|h_1 - z\|^2 < T^2 (m_0 + 1)^2, \tag{5.15}$$

and property (5.1) shows that this path is minimal, a contradiction. □
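Proposition 5.4 can be tested by brute force on a tiny instance. The sketch
below makes several assumptions that are not taken from the text: the norm
(3.6) is treated as the entrywise sum-of-squares (Frobenius) norm, so that the
distance splits as in (5.8); the weights `W` and the parameters `alpha`, `eps`,
`beta`, `K`, `T` are illustrative values chosen only to satisfy properties 1
to 3; and vertices are numbered from 0, so "the first vertex" is vertex 0.

```python
import math

W = [[0, 1, 4, 2],
     [1, 0, 3, 5],
     [4, 3, 0, 1],
     [2, 5, 1, 0]]                      # symmetric integer weights
n, T = len(W), 1.0
m_low = n * min(W[i][j] for i in range(n) for j in range(n) if i != j)
M_up = W[0][2] + W[2][1] + W[1][3] + W[3][0]   # weight of one Hamiltonian cycle
alpha, eps = 0.001, 0.25
beta = (2 * M_up) ** (1.0 / n)
K = M_up * T

c = alpha * beta ** (n - 1) * 2 ** n
properties = ((n * c) ** 2 + (n * eps) ** 2 <= m_low * T ** 2,   # property 1
              K >= M_up * T,                                      # property 2
              eps + beta ** n - c >= M_up * T)                    # property 3

EDGES = [(i, j) for i in range(n) for j in range(n) if i != j]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def m_block(i, j):
    m = [[alpha if r == s else 0.0 for s in range(n)] for r in range(n)]
    m[i][j] += beta
    return m

def dist_sq(P, v_sum, w_sum):
    """Squared distance to z, split into the three terms of (5.8)."""
    first = 0.0
    for i in range(n):
        for j in range(n):
            target = (beta ** n if i == j == 0 else 0.0) + (eps if i == j else 0.0)
            first += (P[i][j] - target) ** 2
    second = sum((x - 2 * K) ** 2 for x in v_sum)
    return first + second + (T * w_sum) ** 2

def closest_word():
    """Depth-first enumeration of all words of length 1..n over the edges."""
    identity = [[1.0 if r == s else 0.0 for s in range(n)] for r in range(n)]
    best = (math.inf, None)
    stack = [((), identity, [0.0] * n, 0.0)]
    while stack:
        word, P, v_sum, w_sum = stack.pop()
        if word:
            d = dist_sq(P, v_sum, w_sum)
            if d < best[0]:
                best = (d, word)
        if len(word) < n:
            for (i, j) in EDGES:
                v2 = list(v_sum)
                v2[i] += K
                v2[j] += K
                stack.append((word + ((i, j),), matmul(P, m_block(i, j)),
                              v2, w_sum + W[i][j]))
    return best

def is_hamiltonian_cycle_from_0(word):
    chained = all(word[r][1] == word[(r + 1) % len(word)][0]
                  for r in range(len(word)))
    starts = sorted(i for i, _ in word)
    return len(word) == n and word[0][0] == 0 and chained and starts == list(range(n))

best_d2, best_word = closest_word()
best_weight = sum(W[i][j] for i, j in best_word)
print(best_word, best_weight, best_d2)
```

In this instance the three Hamiltonian cycles have weights 7, 11, and 14, so
the closest word is expected to trace one of the two orientations of the
weight-7 cycle.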

Proposition 5.5. Suppose edges $e_1, \ldots, e_n$ trace out a Hamiltonian cycle
starting and ending at the first vertex, and whose total weight is
$\le m_0 T A$ for some $A \ge 1$ (that is, within a factor $A$ of being
minimal). Suppose furthermore that

$$(n\alpha\beta^{n-1} 2^n)^2 + (n\varepsilon)^2 \le (mAT)^2, \tag{5.16}$$

which is a consequence of the first assumption of Proposition 5.4 since
$m, A \ge 1$. If $h = g_{e_1} g_{e_2} \cdots g_{e_n}$ (respectively, $h'$) is
the word formed from this cycle (respectively, a minimal cycle), then

$$\|h - z\| \le \sqrt{2}\, A\, \|h' - z\|. \tag{5.17}$$

Proof. By (5.10) one has

$$\|h - z\|^2 \le (n\alpha\beta^{n-1} 2^n)^2 + (n\varepsilon)^2 + (\text{weight of path})^2 \le m^2 A^2 T^2 + m_0^2 A^2 T^2 \le 2 (m_0 A T)^2. \tag{5.18}$$

The result follows because $\|h' - z\| \ge m_0 T$ by (5.8). □

The conditions of the previous Propositions can be achieved with matrix
entries that are polynomially-sized in the input of the tsp instance. For
example, the following parameter choices are easily checked to satisfy them.

Proposition 5.6. Properties 1, 2, and 3 of Proposition 5.4 as well as (5.16)
hold under the following parameter choices.

(i) a = l, p = max(2 n+4 ,4M 2 , ), e = i T = f3 n ~ x ' 2 , and K = 

MT. In this case the matrices g e all have determinant 1. 

(n) T = 1, K = M, (3 = (2M) 1 /", a = and e = In this 

case the matrices all have determinant a n (and are hence invertible). 

(iii) $\alpha = 0$, $\beta = M^{1/n}$, $\varepsilon = 0$, $K = M$, and $T = 1$.
In this case the matrices are not invertible.

Since the entries in these matrices and $z$ are polynomially sized,
Theorem 3.1 then follows immediately from Proposition 5.5 and the corresponding
inapproximability of the Traveling Salesman Problem on graphs
[23, Theorem 3.6].